VizSEC short papers session: Email archive analysis through graphical visualization
Wei-Jen Li, Shlomo Hershkop, Salvatore J. Stolfo 

October 2004 Proceedings of the 2004 ACM workshop on Visualization and data 
mining for computer security 

Publisher: ACM Press 

Full text available: pdf(403.73 KB) Additional Information: full citation, abstract, references, index terms

The analysis of the vast storehouse of email content accumulated or produced by 
individual users has received relatively little attention other than for specific tasks such as 
spam and virus filtering. Current email analysis in standard client applications consists of 
keyword based matching techniques for filtering and expert driven manual exploration of 
email files. 

We have implemented a tool, called the Email Mining Toolkit (EMT) for analyzing email 
archives which includes a graphic ... 

Keywords: email, spam, virus 



2 Research track paper: Combining email models for false positive reduction 
Shlomo Hershkop, Salvatore J. Stolfo 

August 2005 Proceeding of the eleventh ACM SIGKDD international conference on 
Knowledge discovery in data mining KDD '05 

Publisher: ACM Press 

Full text available: pdf(485.01 KB) Additional Information: full citation, abstract, references, index terms

Machine learning and data mining can be effectively used to model, classify and discover 
interesting information for a wide variety of data including email. The Email Mining 
Toolkit, EMT, has been designed to provide a wide range of analyses for arbitrary email 
sources. Depending upon the task, one can usually achieve very high accuracy, but with 
some amount of false positive tradeoff. Generally false positives are prohibitively 
expensive in the real world. In the case of spam detection, for exa ... 

Keywords: aggregators, data mining, email mining, false positive reduction, model 
combination, multiple classifiers, spam 



3 Industry / government track papers: Learning to detect malicious executables in the wild
Jeremy Z. Kolter, Marcus A. Maloof

August 2004 Proceedings of the tenth ACM SIGKDD international conference on 
Knowledge discovery and data mining KDD '04 

Publisher: ACM Press 

Full text available: pdf(216.52 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we describe the development of a fielded application for detecting malicious 
executables in the wild. We gathered 1971 benign and 1651 malicious executables and 
encoded each as a training example using n-grams of byte codes as features. Such 
processing resulted in more than 255 million distinct n-grams. After selecting the most
relevant n-grams for prediction, we evaluated a variety of inductive methods, including 
naive Bayes, decision trees, support vector machines, and boosting. ... 

Keywords: concept learning, data mining, malicious software, security 
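
The byte n-gram encoding mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state the n-gram length or the exact feature encoding, so the choice of n = 4, the presence/absence flags, and the helper names `byte_ngrams` and `presence_vector` are assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter
from typing import Sequence


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count every overlapping window of n bytes in data (n = 4 is an assumption)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def presence_vector(data: bytes, vocabulary: Sequence[bytes], n: int = 4) -> list:
    """Encode one file as 1/0 flags, one per n-gram in a fixed vocabulary.

    The vocabulary stands in for the 'most relevant n-grams' kept after
    feature selection; how they are selected is not shown here.
    """
    seen = byte_ngrams(data, n)
    return [1 if gram in seen else 0 for gram in vocabulary]


# Hypothetical six-byte input, used only to exercise the functions.
sample = bytes.fromhex("4d5a90000300")
vocab = [bytes.fromhex("4d5a9000"), b"abcd"]
features = presence_vector(sample, vocab)
```

A feature vector built this way could then be passed to any of the inductive methods the abstract lists; the sketch stops short of feature selection and training.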



4 Cross-language information retrieval: Translating unknown queries with web corpora
for cross-language information retrieval 

Pu-Jen Cheng, Jei-Wen Teng, Ruei-Cheng Chen, Jenq-Haur Wang, Wen-Hsiang Lu, Lee-Feng 
Chien 

July 2004 Proceedings of the 27th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '04 

Publisher: ACM Press 

Full text available: pdf(387.08 KB) Additional Information: full citation, abstract, references, index terms

It is crucial for cross-language information retrieval (CLIR) systems to deal with the
translation of unknown queries, because real queries may be short. The purpose of
this paper is to investigate the feasibility of exploiting the Web as the corpus source to 
translate unknown queries for CLIR. We propose an online translation approach to 
determine effective translations for unknown query terms via mining of bilingual search- 
result pages obtained from Web search engines. This approach can a ... 

Keywords: cross-language information retrieval, cross-language web search, query 
translation 




5 Mobile code: Anomaly intrusion detection in dynamic execution environments 
Hajime Inoue, Stephanie Forrest 

September 2002 Proceedings of the 2002 workshop on New security paradigms 
Publisher: ACM Press 

Full text available: pdf(867.25 KB) Additional Information: full citation, abstract, references, index terms

We describe an anomaly intrusion-detection system for platforms that incorporate 
dynamic compilation and profiling. We call this approach "dynamic sandboxing." By 
gathering information about applications' behavior usually unavailable to other anomaly 
intrusion-detection systems, dynamic sandboxing is able to detect anomalies at the 
application layer. We show our implementation in a Java Virtual Machine is both effective 
and efficient at stopping a backdoor and a virus, and has a low false positi ... 

Keywords: Java, anomaly detection, dynamic sandboxing 

Stephanie Forrest, Steven A. Hofmeyr, Anil Somayaji
October 1997 Communications of the ACM, volume 40 issue 10

Full text available: pdf(460.66 KB) Additional Information: full citation, references, citings, index terms



Greedy decoding for statistical machine translation in almost linear time 
Ulrich Germann 

May 2003 Proceedings of the 2003 Conference of the North American Chapter of the 
Association for Computational Linguistics on Human Language Technology 
- Volume 1 NAACL '03 

Publisher: Association for Computational Linguistics 

Full text available: pdf(194.88 KB) Additional Information: full citation, abstract, references

We present improvements to a greedy decoding algorithm for statistical machine 
translation that reduce its time complexity from at least cubic (O(n^6) when applied
naively) to practically linear time without sacrificing translation quality. We achieve this
by integrating hypothesis evaluation into hypothesis creation, tiling improvements over 
the translation hypothesis at the end of each search iteration, and by imposing 
restrictions on the amount ... 

Special section on data mining for intrusion detection and threat analysis: Data 
mining-based intrusion detectors: an overview of the Columbia IDS project 
Salvatore J. Stolfo, Wenke Lee, Philip K. Chan, Wei Fan, Eleazar Eskin 
December 2001 ACM SIGMOD Record, Volume 30 issue 4 

Publisher: ACM Press 

Full text available: pdf(1.05 MB) Additional Information: full citation, references, citings, index terms



Mining semantically related terms from biomedical literature 
Goran Nenadic, Sophia Ananiadou 

March 2006 ACM Transactions on Asian Language Information Processing (TALIP), 

Volume 5 Issue 1 
Publisher: ACM Press 

Full text available: pdf(1.40 MB) Additional Information: full citation, abstract, references, index terms

Discovering links and relationships is one of the main challenges in biomedical research, 
as scientists are interested in uncovering entities that have similar functions, take part in 
the same processes, or are coregulated. This article discusses the extraction of such 
semantically related entities (represented by domain terms) from biomedical literature. 
The method combines various text-based aspects, such as lexical, syntactic, and 
contextual similarities between terms. Lexical similarities ar ... 

Keywords: biomedical literature, contextual patterns, term similarities, text mining 



10 Chinese text retrieval without using a dictionary 

Aitao Chen, Jianzhang He, Liangjie Xu, Fredric C. Gey, Jason Meggs
July 1997 ACM SIGIR Forum, Proceedings of the 20th annual international ACM
SIGIR conference on Research and development in information retrieval
SIGIR '97, Volume 31 Issue SI
Publisher: ACM Press

Full text available: pdf(1.37 MB) Additional Information: full citation, references, citings, index terms

Gonzalo Navarro
March 2001 ACM Computing Surveys (CSUR), volume 33 issue 1
Publisher: ACM Press

Full text available: pdf(1.19 MB) Additional Information: full citation, abstract, references, citings, index terms, review

We survey the current techniques to cope with the problem of string matching that allows 
errors. This is becoming a more and more relevant issue for many fast growing areas 
such as information retrieval and computational biology. We focus on online searching and 
mostly on edit distance, explaining the problem and its relevance, its statistical behavior, 
its history and current developments, and the central ideas of the algorithms and their 
complexities. We present a number of experiments to ... 

Keywords: Levenshtein distance, edit distance, online string matching, text searching 
allowing errors 
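
As a point of reference for the edit distance named in the abstract and keywords above, the following is a minimal sketch of the textbook dynamic-programming computation of Levenshtein distance between two whole strings. It is not taken from the survey, which concentrates on online searching for approximate occurrences inside a longer text; the function name `edit_distance` is an illustrative choice.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; only two rows are kept in memory.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


distance = edit_distance("survey", "surgery")
```

The running time is proportional to the product of the two string lengths, and memory use is proportional to the length of the second string.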



12 Intrusion detection: The role of suspicion in model-based intrusion detection
Timothy Hollebeek, Rand Waltzman
September 2004 Proceedings of the 2004 workshop on New security paradigms

Publisher: ACM Press 

Full text available: pdf(116.29 KB) Additional Information: full citation, abstract, references

We argue in favor of the explicit inclusion of suspicion as a concrete concept to be used in 
the analysis of audit data in order to guide the search for evidence of misuse. Our 
approach is similar to that of a human forensic analyst, who first notices details that seem 
slightly odd, and then investigates further and cross checks information in an attempt to 
build a coherent explanation for the observed details. We use deductive reasoning 
combined with expert knowledge about system behavior, pote ... 

13 Part-of-speech induction from scratch 
Hinrich Schutze 

June 1993 Proceedings of the 31st annual meeting on Association for Computational 
Linguistics 

Publisher: Association for Computational Linguistics 

Full text available: pdf(717.90 KB), Publisher Site Additional Information: full citation, abstract, references, citings

This paper presents a method for inducing the parts of speech of a language and part-of- 
speech labels for individual words from a large text corpus. Vector representations for the 
part-of-speech of a word are formed from entries of its near lexical neighbors. A 
dimensionality reduction creates a space representing the syntactic categories of 
unambiguous words. A neural net trained on these spatial representations classifies 
individual contexts of occurrence of ambiguous words. The method classif ... 

1 Microcomputers in educational and research environments: their management,
acquisition, upgrade, and maintenance
Jean F. Coppola, Francis T. Marchese

December 1992 Proceedings of the 20th annual ACM SIGUCCS conference on User
services
Publisher: ACM Press

Full text available: pdf(770.75 KB) Additional Information: full citation, index terms



2 Level II technical support in a distributed computing environment 
Tim Leehane

September 1996 Proceedings of the 24th annual ACM SIGUCCS conference on User
services
Publisher: ACM Press

Full text available: pdf(5.73 MB) Additional Information: full citation, references, index terms



3 Building a help desk from scratch, with no staff, no equipment and no money:
molding novice student consultants into seasoned help desk operators
Carol L. Smith

September 1996 Proceedings of the 24th annual ACM SIGUCCS conference on User
services
Publisher: ACM Press

Full text available: pdf(437.94 KB) Additional Information: full citation, references, index terms




A reliable multicast framework for light-weight sessions and application level framing 
Sally Floyd, Van Jacobson, Ching-Gung Liu, Steven McCanne, Lixia Zhang 
December 1997 IEEE/ACM Transactions on Networking (TON), volume 5 issue 6 

Publisher: IEEE Press 

Full text available: pdf(310.74 KB) Additional Information: full citation, references, citings, index terms, review

A bit of viral protection is worth a megabyte of cure
Tim Fitzgerald 

June 1995 ACM SIGUCCS Newsletter, volume 25 issue 1-2 
Publisher: ACM Press 

Full text available: pdf(427.33 KB) Additional Information: full citation, abstract, index terms

Even in today's world of safeguarded networks and advanced detection software, 
computer viruses are still running amok in some of the seedier niches of cyberspace and 
hiding out on unclean disks and unprotected hard drives. Speculative rumors of wide- 
spread epidemics have only added to the confusion as computer users all over the world 
wonder if their systems are at risk and if there is any way to shield themselves from these 
stealth operatives of electronic malfeasance. 

The military impact of information technology 

Jeff Johnson, Ronald L. Davis, Roger W. Wester, Frank Exner, Crispin Cowan, Mayur Patel, 
Michael Lingle, Barry Goldstein, James K. Yun, Carey Nachenberg 
April 1997 Communications of the ACM, volume 40 issue 4 

Publisher: ACM Press 

Full text available: pdf(130.61 KB) Additional Information: full citation, index terms



Where the students are ...computing services at the customer source 
Glenda E. Mourn 

October 2000 Proceedings of the 28th annual ACM SIGUCCS conference on User 
services: Building the future 

Publisher: ACM Press 

Full text available: pdf(106.36 KB) Additional Information: full citation, index terms



Keywords: consulting, hardware repair, software sales, student services 



8 The evolving support toolbox or redistributed support 
John Hawkins 

September 1991 Proceedings of the 19th annual ACM SIGUCCS conference on User
services
Publisher: ACM Press

Full text available: pdf(594.77 KB) Additional Information: full citation, index terms




9 Book Excerpt: Computer Ethics, Second Edition by Deborah G. Johnson (Prentice
Hall, 1994)
Deborah G. Johnson

December 1993 ACM SIGCAS Computers and Society, Volume 23 issue 3-4

Publisher: ACM Press

Full text available: pdf(581.08 KB) Additional Information: full citation

10 Bucknell's software service clinic — meeting a computing support challenge
Karen Kniss

November 1997 Proceedings of the 25th annual ACM SIGUCCS conference on User
services: are you ready?
Publisher: ACM Press

Full text available: pdf(922.20 KB) Additional Information: full citation, index terms



11 Staying Connected: Body of technology 
Meg McGinity 

September 2000 Communications of the ACM, Volume 43 issue 9 
Publisher: ACM Press 

Full text available: pdf(72.49 KB), html(11.64 KB) Additional Information: full citation, citings, index terms



12 The costly implications of consulting in a virus-infected computer environment 
K. Nunez, T. Gerace, A. Hartman

October 1989 Proceedings of the 17th annual ACM SIGUCCS conference on User
Services
Publisher: ACM Press

Full text available: pdf(468.70 KB) Additional Information: full citation, index terms



13 WORKBENCH— the financial benefit and savings to the university
Angelo Montovino

October 2000 Proceedings of the 28th annual ACM SIGUCCS conference on User
services: Building the future
Publisher: ACM Press

Full text available: pdf(78.76 KB) Additional Information: full citation, index terms




14 Illustrative risks to the public in the use of computer systems and related technology
Peter G. Neumann

January 1996 ACM SIGSOFT Software Engineering Notes, Volume 21 issue 1
Publisher: ACM Press

Full text available: pdf(2.54 MB) Additional Information: full citation




15 Semi-automatic recognition of noun modifier relationships
Ken Barker, Stan Szpakowicz

August 1998 Proceedings of the 17th international conference on Computational
linguistics - Volume 1, Proceedings of the 36th annual meeting on
Association for Computational Linguistics - Volume 1

Publisher: Association for Computational Linguistics

Full text available: pdf(626.02 KB), Publisher Site Additional Information: full citation, abstract, references

Semantic relationships among words and phrases are often marked by explicit syntactic or 
lexical clues that help recognize such relationships in texts. Within complex nominals, 
however, few overt clues are available. Systems that analyze such nominals must 
compensate for the lack of surface clues with other information. One way is to load the
system with lexical semantics for nouns or adjectives. This merely shifts the problem 
elsewhere: how do we define the lexical semantics and build large sem ... 

16 Mobile computing at Rensselaer Polytechnic Institute
Patrick Valiquette, Mark Miller, Ed Seeger 

October 2000 Proceedings of the 28th annual ACM SIGUCCS conference on User 
services: Building the future 

Publisher: ACM Press 

Full text available: pdf(135.14 KB) Additional Information: full citation, index terms




Keywords: RPI, Rensselaer Polytechnic Institute, ThinkPad, laptop computers, mobile
computing



17 Manage all the computer labs on campus? what did I do to deserve this? 
Kathy DuBose 

October 2000 Proceedings of the 28th annual ACM SIGUCCS conference on User 
services: Building the future 

Publisher: ACM Press 

Full text available: pdf(144.13 KB) Additional Information: full citation, index terms



Keywords: computer, lab, manager, support 



18 Creating a technology desk in an information commons 
Susan Hales, Don Rea, Marcella Siegler 

October 2000 Proceedings of the 28th annual ACM SIGUCCS conference on User 
services: Building the future 

Publisher: ACM Press 

Full text available: pdf(136.62 KB) Additional Information: full citation, citings, index terms



19 Incremental cryptography and application to virus protection
Mihir Bellare, Oded Goldreich, Shafi Goldwasser 

May 1995 Proceedings of the twenty-seventh annual ACM symposium on Theory of 
computing 

Publisher: ACM Press 

Full text available: pdf(1.65 MB) Additional Information: full citation, references, citings, index terms




20 A taxonomy of computer program security flaws 

Carl E. Landwehr, Alan R. Bull, John P. McDermott, William S. Choi
September 1994 ACM Computing Surveys (CSUR), Volume 26 issue 3

Publisher: ACM Press

Full text available: pdf(3.81 MB) Additional Information: full citation, abstract, references, citings, index terms, review

An organized record of actual flaws can be useful to computer system designers, 
programmers, analysts, administrators, and users. This survey provides a taxonomy for 
computer program security flaws, with an Appendix that documents 50 actual security 
flaws. These flaws have all been described previously in the open literature, but in widely
separated places. For those new to the field of computer security, they provide a good 
introduction to the characteristics of security flaws and how they ... 

Keywords: error/defect classification, security flaw, taxonomy 